{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WfGpInivS0fG"
   },
   "source": [
    "<h2 align=\"center\">点击下列图标在线运行HanLP</h2>\n",
    "<div align=\"center\">\n",
    "\t<a href=\"https://colab.research.google.com/github/hankcs/HanLP/blob/doc-zh/plugins/hanlp_demo/hanlp_demo/zh/lid_restful.ipynb\" target=\"_blank\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\t<a href=\"https://mybinder.org/v2/gh/hankcs/HanLP/doc-zh?filepath=plugins%2Fhanlp_demo%2Fhanlp_demo%2Fzh%2Flid_restful.ipynb\" target=\"_blank\"><img src=\"https://mybinder.org/badge_logo.svg\" alt=\"Open In Binder\"/></a>\n",
    "</div>\n",
    "\n",
    "## 安装"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nf9TgeCTC0OT"
   },
   "source": [
    "无论是Windows、Linux还是macOS，HanLP的安装只需一句话搞定："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "jaW4eu6kC0OU",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!pip install hanlp_restful -U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_xI_bLAaC0OU"
   },
   "source": [
    "## 创建客户端"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "IYwV-UkNNzFp",
    "outputId": "54065443-9b0a-444c-f6c0-c701bc86400b",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from hanlp_restful import HanLPClient\n",
    "HanLP = HanLPClient('https://www.hanlp.com/api', auth=None, language='zh') # auth不填则匿名，zh中文，mul多语种"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1Uf_u7ddMhUt",
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "#### 申请秘钥\n",
    "由于服务器算力有限，匿名用户每分钟限2次调用。如果你需要更多调用次数，[建议申请免费公益API秘钥auth](https://bbs.hanlp.com/t/hanlp2-1-restful-api/53)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "elA_UyssOut_"
   },
   "source": [
    "## 语种识别\n",
    "语种识别任务的输入为一个或多个文档："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "BqEmDMGGOtk3"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'en'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HanLP.language_identification('In 2021, HanLPv2.1 delivers state-of-the-art multilingual NLP techniques to production environments.')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SwaPn1hjC0OW"
   },
   "source": [
    "返回对象为[ISO 639-1编码](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes)。HanLP支持返回语种对应的概率（置信度）："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "egpWwHKxC0OX",
    "outputId": "f7c77687-dd75-4fa2-dbd2-be6bda8a3fff"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['ja', 0.9976244568824768]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HanLP.language_identification('2021年、HanLPv2.1は次世代の最先端多言語NLP技術を本番環境に導入します。', prob=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kq_j5TLFC0OX"
   },
   "source": [
    "HanLP也支持返回概率最高的`topk`个语种："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "isJhzYyIC0OX",
    "outputId": "683c8489-dffc-426e-f95b-e91dfb373260"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['zh', 'ja']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HanLP.language_identification('2021年 HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', topk=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "该功能对于混合了多个语种的文档而言特别实用："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'zh': 0.3952908217906952,\n",
       " 'en': 0.37189167737960815,\n",
       " 'ja': 0.056213412433862686}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = '''\n",
    "2021年 HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。\n",
    "In 2021, HanLPv2.1 delivers state-of-the-art multilingual NLP techniques to production environments.\n",
    "'''\n",
    "\n",
    "HanLP.language_identification(text, topk=3, prob=True)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "lid_restful.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}